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1.  INTRODUCTION 

The  development  of  the  Fast  Fourier  Transform  (FFT)  led  to  the  realization  of 
a  number  of  FFT  applications  to  speech  research.  Chief  among  these  are  the 
cepstrum  pitch  detector  for  automatic  detection  of  the  pitch  period  of  voiced 
speech,  the  cepstrally  smoothed  log  spectrum  for  formant  selection,  and  the 
Chirp  Z-transform  for  narrow-band  frequency  analysis.  Each  of  these  applications 
is  discussed  in  the  following  sections  and  references  are  given  for  more  detailed 
presentations.  In  addition,  complete  FORTRAN  codes  and  examples  accompany  the 
discussions. 

2.  CEPSTRUM  PITCH  DETECTOR 

In  1967,  Noll  [1]  described  a  method  for  computing  the  pitch  period  of  voiced 
speech  using  the  cepstrum.  The  cepstrum  is  defined  as  the  square  of  the  Fourier 
transform  of  the  logarithm  power  spectrum  and  has  a  strong  peak  corresponding 
to  the  pitch  period  of  the  voiced-speech  segment  being  analyzed.  Schafer  and 
Rabiner  [2]  have  also  described  this  procedure  in  a  way  which  allows  the  com¬ 
puter  code  to  be  easily  constructed.  The  FORTRAN  program  listing  is  shown  in 
Figure  1.  The  input  data  are  the  digitized  speech  sample  S(I),  the  number  N 
of  samples  S(I)  (which  must  be  compatible  with  the  FFT  algorithm  used),  the 
sampling  period  T  (in  sec.),  and  a  rough  estimate  of  the  pitch  period  PP.  The 
latter  parameter  is  used  to  establish  the  length  of  the  Hamming  window  (viz., 
4*PP)  through  which  the  signal  S(I)  is  passed. 

As  an  example,  a  truncated  Fourier  series 
a  “ 

f(t)  -  —•  +  2^  (ajccos2irkwt  +  b^sin2Trkwt) 
k-1 
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SUBROUTINE  CEPS<  S#N#  T#C#PP#X) 

CEPSTRUM  CALCULATION 

REFERENCE*  R.  V»  SCHAFER  A  L.  R*  RABINER# 

"SYSTB4  FOR  AUTOMATIC  FORMANT  ANALYSIS  OF  VOICED  SPEECH# 
J.  ACOUST.  SOC.  AMER.  47#  634  -  648  (19  70)* 

S<I>  -  DIGITIZED  SPEECH  SAMPLE 
T  -  SAMPLING  PERIOD  <IN  SECO 
C<I>  »  CEPSTRUM  OF  S<  I  > 

PP  »  PITCH  PERIOD  ESTIMATE 
X<I>  -  COMPLEX  CEPSTRUM 
DIMENSION  S<N)#C<N)#X<N> 

COMPLEX  X 

MULTIPLY  SAMPLE  BY  HAMMING  WINDOW 
XN-N 

TWOPI- 6.  28318531 7178 
DURWI N«  4. *PP 
DO  17  I*  1#N 
XI-I-1 

IF  (XI* T- DURWI N)  10# 10# 15 
10  ARG-TWOPI*XI*T/DURWIN 

S<  I  )»S<  I  )*<  .  54-.  46*C0S<ARG)  ) 

GO  TO  17 
15  S<D»0. 

17  CONTINUE 

DO  BO  I ■ 1#N 

20  X<I)»CMPLX<S<I)#0O 

FIND  M  SUCH  THAT  N  ■  2**M 
K“N 
1*0 

30  K-K/2 

I»I  +  l 

IF  <K-1>  40#  40#  30 
40  M-I 

CALL  FFT<X#M#N#-1> 

DO  50  1“  1#N 
AaREAL<X< I)) 

B<<AIMAG<X<  I )  ) 

S<  I  >■  1 0. *AL0G10<  A*A+B*B> 

50  X< I >«CMPLX< S< I )#0* ) 

CALL  FFT<X#M#N#  1) 

DO  60  I«1#N 
X<  I )*X< I )/XN 
60  C<I)«REAL<X< I >) 

RETURN 

END 


M 


Figure  1.  Cepstrum  Pitch  Detector  Program  Listing 
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was  generated  with  the  following  parameters: 


m  =  5  ,  a)  =  500 

a  =  0 
o 


al 

*■  0 

bl  = 1 

a2 

=  0 

b2  =  1/2 

a3 

=  0 

b3=  1 

a4 

=  0 

b4  =  0 

a5 

=  0 

b5  =  1/2, 

and  a  set  of  512  data  points  was  obtained  by  sampling  f(t)  at  t  =  (j-l)T 
(j  =  1,2,  . 512)  for  T  =  .0001  (10,000  samples/sec.)  A  pitch  period 
estimate  of  5  msec,  was  used.  The  resulting  cepstrum  is  shown  in  Figure  2. 
Note  that  a  strong  peak  occurs  at  t  =  2  msec.,  the  actual  pitch  period  of  the 
signal . 


3.  CEPSTRALLY  SMOOTHED  LOG  SPECTRUM 

As  shown  by  Schafer  and  Rabiner  [2],  the  spectral  envelope  can  be  obtained  by 
cepstrally  smoothing  the  log  spectrum.  This  smoothing  is  accomplished  by  low- 
pass  filtering  the  log  magnitude  of  the  DFT.  To  this  end,  the  cepstrum  is 
multiplied  by  a  low-time  filter  whose  cut-off  is  less  than  the  pitch  period, 
and  then  transformed  by  the  DFT  to  produce  the  smoothed  spectral  envelope. 

The  FORTRAN  program  listing  is  given  in  Figure  3. 

As  an  illustration,  the  unsmoothed  log  spectrum  of  the  512  data  points  of 
section  2  was  first  computed,  using  an  FFT  algorithm;  this  is  shown  in  Figure 

4.  Figure  5  shows  the  cepstrally  smoothed  log  spectrum. 
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SUBROUTINE  PITCHC  S#N#T#C#P# SPEC) 

ESTIMATION  OF  PITCH  PERIOD  AND  SPECTRAL  ENVELOPE 

REFERENCE!  R.  V.  SCHAFER  &  L*  R«  RABINER# 

•'SYSTEM  FOR  AUTOMATIC  FORMANT  ANALYSIS  OF  VOICED  SPEECH# ** 

J.  ACOUST.  SOC.  AMER.  47#  634  -  648  (1970)* 

SCI)  -  DIGITIZED  SPEECH  SAMPLE 
N  -  NUMBER  OF  VALUES  SCI) 

T  -  SAMPLING  PERIOD  C  IN  SEC.  ) 

P  »  PITCH  PERIOD 

SPECCI)  -  CEPSTRALLY  SMOOTHED  LOG  SPECTRUM  C  SPECTRAL  ENVELOPE  ) 

DIMENSION  SC N ) # SPEC C N ) # CC N ) 

COMPLEX  CCC512) 

CALL  CEPSC  S#N# T#  C#P# CC) 

TAUMIN  -  MINIMUM  EXPECTED  PITCH  PERIOD 
TAUMIN- *001 

TAUMAX  -  MAXIMUM  EXPECTED  PITCH  PERIOD 
TAUMAX— 020 

SEARCH  THE  CEPSTRUM  FOR  A  STRONG  PEAK  IN  THE  REGION 
TAUMIN  <  J*T  <  TAUMAX 


JMIN-TAUMIN/T+1. 

JMAX“ TAUMAX /T+ 1 • 

PEAK-CC JMIN) 

DO  30  I-JMIN# JMAX 
10  IF  CPEAK-CCI))  20# 20# 30 
20  PEAK-CC  I) 

INDEX-1 
30  CONTINUE 

P-FLOATC  INDEX- 1)*T 

LOW  -  PASS  FILTER  THE  LOG  MAGNITUDE  OF  THE  DFT 


i 


Figure  3.  Cepstrally  Smoothed  Log  Spectrum  Program  Listing 
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PI-3.  141 598653569 

Tl-.6*P 

DT-. 1*P 

TlPDT-Tl+DT 

DO  80  I-l#N 

XI-I-l 

IF  <XI*T-T1>  80# 40# 40 
40  IF  <XI*T-TIPDT>  50#  60#  60 
50  CC< I >-CC( I >*• 5*< l.+COS<PI*<XI*T-Tl>/DT>> 

GO  TO  80 

60  CC( I )-CMPLX(0. #0*  > 

80  CONTINUE 

FIND  M  SUCH  THAT  N  -  2**M 
K-N 
1-0 
90  K-K/fi 
I-I  +  l 

IF  <K“l>  100* 100# 90 
100  M-I 

CALL  FFT<CC#M#N#-1> 

DO  110  I-1#N 

110  SPECC I )-2.*REAL< CC< I ) ) 

ADD  '.THE  SPECTRUM  EQUALIZING  CURVE  TC  THE  SMOOTHED  SPECTRAL 
ENVELOPE 

DO  140  K*l ?N 

W- FLO  AT<  K- 1 )  /  <  FLO  AT<  N  )  *  T) 

IF  < W- 300C* )  120# 120# 130 
120  SPECC K)-SPEC<K>- 10. ♦COS< PI* W/ 3000* ) 

GO  TO  140 

130  SPEC<K)«SPEC<K)+10. 

140  CONTINUE 
C 

RETURN 

END 


(Cont'd)  Figure  3.  Cepstrally  Smoothed  Log  Spectrum  Program  Listing 


Figure  4.  Unsmoothed  Log  Spectrum  Example 
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4.  CHIRP  Z-TRANSFORM 

The  Chirp  Z-transform  (CZT)  is  a  computational  algorithm  for  efficiently 
evaluating  the  Z-transform  of  a  sequence  of  n  samples  at  m  points  in  the  Z-plane 
which  lie  on  circular  or  spiral  contours  beginning  at  an  arbitrary  point.  Its 
importance  to  speech  analysis  stems  from  its  ability  to  efficiently  evaluate 
the  Z-transform  off  the  unit  circle,  the  required  contour  for  evaluation  of 
the  DFT.  In  this  way,  the  rontour  can  be  made  to  pass  closer  to  the  poles  and 
zeros  of  the  system,  reducing  the  bandwidths  and  sharpening  the  transfer 
function.  A  detailed  description  is  given  in  references  [2]  and  [3]. 

The  FORTRAN  program  listing  of  the  CZT  is  shown  in  Figure  6.  Subroutine  CZT 
must  be  used  in  conjunction  with  three  other  subroutines:  FFT,  CONTOR,  and 
POWER.  The  FFT  used  in  this  case  is  algorithm  I  of  SDC  TM-4857/100/00,  "A 
Comparison  of  FFT  algorithms".  Subroutine  CONTOR  establishes  an  appropriate 
contour  over  which  to  evaluate  the  CZT.  Figure  7  is  a  listing  of  CONTOR  for 
establishing  parameters  for  a  circle  of  radius  .95.  Subroutine  POWER,  given 
in  Figure  8,  is  used  to  replace  the  computation  of  high  powers  of  the  ex¬ 
ponential  function  by  iterated  multiplications.  This  has  yielded  slightly 
more  stability  in  the  code.  However,  even  in  its  double-precision  form  (Fig¬ 
ure  9),  the  CZT  has  been  found  to  be  unstable  for  as  few  as  16  data  points. 


onoo  o o o o o o o o o o no o 
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( 

\ _ 1 


SUBROUTINE  CZTCN*X*M*A* W> 

CHIRP  Z- TRANS FORM 

REFERENCES  L»R*  RABINER*  R.li.  SCHAFER  A  C*  M.  RADER* 

"  THE  CHIRP  Z-TRANSFOHM  AND  ITS  APPLICATION  • 

BELL  SYSTEM  TECH*  J.  48*  1249-1898  (1969)* 

N  -  NUMBER  OF  INPUT  DA'iA  POINTS  XCI) 

X  «  INPUT  ARRAY  OF  COMPLEX  DATA  POINTS 
M  -  NUMBER  OF  OUTPUT  POINTS  DESIRED 
A* V  «  PARAMETERS  DETERMINING  THE  CONTOUR  OVER  WICH 
THE  CZT  IS  TO  BE  EVALUATED  < SEE  LISTING  OF  SUBROUTINE  CONTOR) 

COMPLEX  X<  512)*  A*  V*  V<  512) 

COMPLEX  C*  D*  PVR 

FIND  THE  SMALLEST  POVER  OF  TWO  WHICH  IS  GREATER  THAN  OR  EQUAL 
TO  N+M-l 

K-N+M-l  (  ) 

1-1 

5  K-K/2 

IF  CX-1)  10*20*10 
10  I-I+l 

GO  TO  5 

80  L*8**< I  +  l) 

DO  30  J*1*N 
XJ-J- 1 

CALL  POVERC  W*X J/2*  *  PWR) 

C-PWR/A 

CALL  POVERC C*XJ* PVR) 

30  XCJ)»PVR*XCJ) 

DO  40  J"N+1*L 


Figure  6.  Chirp  Z-Transform  Program  Listing 
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40  XC  J)«CMPLXC0.#0.) 

CALL  FFT<X#I  +  1#-1> 

DO  50  J-1#M 
XJ«*J-1 

CALL  POVERC V# -XJ*XJ/2«#PWR> 

50  V<J)«PWR 

IF  CL“<N+M“1)  )  55#  68#  55 
55  DO  60  J*M+1#L-N+1 
60  V< J)=CMPLX<0.#0e) 

68  XLbL 

DO  70  J»L-N+2#L 
XJ-J-1 

CALL  POWER* W#-<XL-XJ)*<XL-XJ?/2.#PWR> 
70  V<J)«PVR 

CALL  FFT<  V#I**1#“1> 

DO  80  J«1#L 
80  V<J)«V< J)*X< J> 

CALL  FFTC V#I+1# 1) 

DO  90  J»1#L 
90  V<J>»V<J)/XL 
DO  100  J«1#M 
XJ-J-l 

CALL  POWER*  V#X J*X J/2.  # PVR) 

D«PWR 

100  X*J)«V<J>*D 

RETURN 
END 


(Cont'd)  Figure  6.  Chirp  Z-Transform  Program  Listing 
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SUBROUTINE  CONTORCNPTS#  T#  BY*  BV, H« AO#  TOO#  WO# PHI 0) 

COMPUTES  AN  APPROPRIATE  CONTOUR  ON  WHICH  TO  EVALUATE  TOE  CZT 

INPUTS  NPTS  -  NUMBER  OF  VALUES  OF  CEPSTRUN  TO  BE  USED 
T  -  SAMPLING  PERIOD  (IN  SEC*) 

BF  -  LOWEST  FREQUENCY  OF  REGION  OVER  WICK  TOE 
NARROW* BAND  FREQUENCY  ANALYSIS  IS  TO  BE  PERFORMED 
BW  -  BANDWIDTH  OF  REGION 

OUTPUTS  M  -  NUMBER  OF  OUTPUT  VALUES  TO  BE  COMPUTED  BY  CZT 
AO  -  RADIUS  OF  CONTOUR 

THOS  CENTERS  TOE  ANALYSIS  ON  TOE  FREQUENCY  REGION 
OF  INTEREST 

VOs  TAKEN  HERE  TO  BE  -  1  TO  MAKE  TOE  CONTOUR  AN  ARC 
OF  A  CIRCLE 

PHI Os  DETERMINES  TOE  FREQUENCY  SPACING  OF  THE  SPECTRAL 
SAMPLES 

TVOPI-6*  28318  5317178 
A0«EXP<-*0314) 

THO-BF*T/TWOPI 

VO-1* 

PHI 0-BW*T/<  FLOATCM- 1 >*TVOPI ) 

RETURN 

END 


Figure  7.  Subroutine  CONTOR  Program  Listing 
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SUBROUTINE  POWERCZ# A* PVR) 

COMPUTES  Z**A  FOR  LARGE  VALUES  OF  A 

DOUBLE  PRECISION  Z#  V#P#X#PVR#A#R 
IF  CA-20*  >  5#5#8 
5  PVR-Z++A 
RETURN 

REPRESENT  A  AS  A  ■  10*N  +  R#  WHERE  R  IS  LESS  THAN  10 
B  N-A/10. 

R-A-10.*FL0ATCN) 

W-Z**10 

P«Z**R 

DETERMINE  WETHER  N  IS  EVEN  OR  ODD< 

IF  EVEN#  FIND  K  SUCH  THAT  N  -  2*K 
IF  ODD#  FIND  K  SUCH  THAT  N  -  2*K  +  1 

K-N/2 

M-M0D<N#2) 

X-V 

IF  CK.EQ.l)  GO  TO  25 
DO  20  Z-1#K~1 

25  PWR»W*W*P 

IF  CM)  40# 40# 30 
30  PVR-PVR4X 
40  RETURN 
END 


Figure  8.  Subroutine  POWER  Program  Listing 
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(  ) 


SUBROUT X NE  CZTCXRjX I#N# K;  AO*  THO*  WO*PHO> 

DIMENSION  YRC  128>*YIC  128>*YRS<  188>*YIS<  188) 
DIMENSION  VRC  128)# VIC  128>*  VRSC  1 88  >*  VI  SC  128) 
DIMENSION  GRCt88>*GIC188)*XRC186>*XIC186> 

DOUBLE  PRECISION  YR*YI*PHO* VR* VI 
DOUBLE  PRECISION  FJ#XP* AO* VO*  PVR*  PR*  PI*  TVOPI*  ARG 
PI-3.14159265 
TVOPI -6*  88  3 16  53 
K-N+M- 1 
1-1 

5  K-K/2 

IF  CK- 1 )  1C*£0*  10 

10  X-I  +  l 

GO  TO  S 

SO  L-8**CI+1> 

DO  30  J«1*N 
FJ-J- 1 
XP-FJ4FJ/8* 

CALL  POVERC  AO* -FJ*  PVR) 

CALL  POVERC  VO*XP*PR) 

ARG-PI*FJ*C FJ*PH0-8.*TH0) 

ARG-ENODC  AF**  TVOPI > 

YRC  J  >  «P  VR*PR*XRC  J)*DCOSC  AR6> 

YIC  J)»PVR*PR*XRCJ)*DSINCARG) 

30  CONTINUE 

DO  40  J-N+1*L 
YRCJ>-0. 

YIC J)«0* 

40  CONTINUE 

DO  45  J-1*L 
YRSC  JJ-SNGLCYRC  J>> 

YISC  J)-SNGLCYIC  J)  > 

45  CONTINUE 

CALL  FFTC  YRS*YIS*  I  +  1*L* - 1 > 

DO  50  J-1*M 
FJ-J- l 

XP—  FJ*FJ/2. 

CALL  POVERC  VO*XP*PVR) 

ARG-PX  *  F J*  F  J*PHO 
ARG-DMODC  ARG*  TVOPI  > 

VRC J)-PVR4DC0SC ARG) 

VI C  J>— PVR4DSINC  ARG) 
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SO  CONTINUE 

IF  CL-CN+M-l))  55#  68#  55 
55  DO  60  J*M+1#L-N+1 
VRCJ>*0. 

60  VICJ>-0. 

68  FL-L 

DO  70  J*L-N+8#L 
FJ*  J- 1 

XP»-CFL-FJ)*C  FL-FJ5/2. 

CALL  POVERC  VO#XP#PR) 

ARG-PX  *  C - 8 . *XP ) *PHO 
ARG*DNODC ARG# TVOPX ) 

VRC J>-PR*DCOSCARG> 

VI  C  J) — PR*  DS I NC  ARG  ) 

70  CONTINUE 

DO  75  J*1#L 
VRS<  J)-SNGL(VRCJ)) 

75  VISCJ>*SNGLCVICJ>> 

CALL  FFTCVRS#VIS#I  +  1#L#-1> 

DO  80  J*1#L 

GRC  J)*VRSCJ)*YRSCJ)-VISC  J)*YISCJ> 

80  GI C  J)*VRSC J)*YI  SC  J)+VI  SC  J)*YRSC  J) 

CALL  FFTC GR#  GI#  1  +  1#L#  1  > 

DO  90  J*1#L 
GRCJ)*GRCJ)/FL 
90  GIC J)-GICJ)/FL 
DO  100  J-1#M 
FJ* J- 1 
XP-FJ*FJ/8. 

CALL  POVERC VO#XP#PR> 

ARG»PI*FJ*FJ*PHO 
ARG-DNODC ARG#  TVOPI > 

XRC  J  ) -PR*  C  GRC  J  )  *  DCO  S  C  ARG  > - GI  C  J>*  DS  I  NC  ARG  )  ) 
XIC  J)*PR*CGIC  J)*DCOSC  ARG)+GRC  J)*DSINC  ARG)  > 
100  CONTINUE 
RETURN 
BID 
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